AITopics | online gradient descent

Reinforcement Learning with Convex Constraints

Neural Information Processing SystemsFeb-12-2026, 20:21:53 GMT

In standard reinforcement learning (RL), a learning agent seeks to optimize the overall reward. However, many key aspects of a desired behavior are more naturally expressed as constraints.

artificial intelligence, machine learning, reinforcement learning, (17 more...)

Neural Information Processing Systems

Country:

North America > United States > Maryland (0.04)
North America > Canada (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)

Add feedback

Non-convexonlinelearningviaalgorithmic equivalence

Neural Information Processing SystemsFeb-10-2026, 15:15:05 GMT

We study an algorithmic equivalence technique between non-convex gradient descent and convex mirror descent.

artificial intelligence, descent, machine learning, (18 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

8b40b4984e6c09ee49333ddd2dc719d4-Paper-Conference.pdf

Neural Information Processing SystemsFeb-10-2026, 15:15:02 GMT

descent, gradient descent, optimization, (15 more...)

Neural Information Processing Systems

Country: Asia > Middle East > Jordan (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Gradient Descent (0.33)

Add feedback

On Tractable \Phi -Equilibria in Non-Concave Games

Neural Information Processing SystemsDec-27-2025, 14:37:10 GMT

While Online Gradient Descent and other no-regret learning procedures are known to efficiently converge to a coarse correlated equilibrium in games where each agent's utility is concave in their own strategy, this is not the case when utilities are non-concave -- a common scenario in machine learning applications involving strategies parameterized by deep neural networks, or when agents' utilities are computed by neural networks, or both. Non-concave games introduce significant game-theoretic and optimization challenges: (i) Nash equilibria may not exist; (ii) local Nash equilibria, though they exist, are intractable; and (iii) mixed Nash, correlated, and coarse correlated equilibria generally have infinite support and are intractable. To sidestep these challenges, we revisit the classical solution concept of $\Phi$-equilibria introduced by Greenwald and Jafari [GJ03], which is guaranteed to exist for an arbitrary set of strategy modifications $\Phi$ even in non-concave games [SL07]. However, the tractability of $\Phi$-equilibria in such games remains elusive. In this paper, we initiate the study of tractable $\Phi$-equilibria in non-concave games and examine several natural families of strategy modifications. We show that when $\Phi$ is finite, there exists an efficient uncoupled learning algorithm that approximates the corresponding $\Phi$-equilibria. Additionally, we explore cases where $\Phi$ is infinite but consists of local modifications, showing that Online Gradient Descent can efficiently approximate $\Phi$-equilibria in non-trivial regimes.

artificial intelligence, machine learning, proceedings, (9 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.82)

Add feedback

Non-convex online learning via algorithmic equivalence

Neural Information Processing SystemsDec-24-2025, 18:10:42 GMT

We study an algorithmic equivalence technique between non-convex gradient descent and convex mirror descent. We start by looking at a harder problem of regret minimization in online non-convex optimization. We show that under certain geometric and smoothness conditions, online gradient descent applied to non-convex functions is an approximation of online mirror descent applied to convex functions under reparameterization. In continuous time, the gradient flow with this reparameterization was shown to be \emph{exactly} equivalent to continuous-time mirror descent by Amid and Warmuth, but theory for the analogous discrete time algorithms is left as an open problem. We prove an $O(T^{\frac{2}{3}})$ regret bound for non-convex online gradient descent in this setting, answering this open problem. Our analysis is based on a new and simple algorithmic equivalence method.

descent, name change, non-convex online, (7 more...)

Neural Information Processing Systems

Industry: Education > Educational Setting > Online (0.44)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Improved Dynamic Regret for Non-degenerate Functions

Lijun Zhang, Tianbao Yang, Jinfeng Yi, Rong Jin, Zhi-Hua Zhou

Neural Information Processing SystemsNov-21-2025, 13:01:02 GMT

Neural Information Processing Systems http://nips.cc/

dynamic regret, learner, sequence, (15 more...)

Neural Information Processing Systems

Country:

North America > United States > Iowa (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
(2 more...)

Industry: Education (0.47)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Optimization (0.46)

Add feedback

Adaptive Online Learning in Dynamic Environments

Lijun Zhang, Shiyin Lu, Zhi-Hua Zhou

Neural Information Processing SystemsNov-20-2025, 14:27:00 GMT

In the literature, there are two different forms of dynamic regret.

artificial intelligence, dynamic regret, machine learning, (12 more...)

Neural Information Processing Systems

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
Asia > China > Jiangsu Province > Nanjing (0.04)
North America > Canada > Quebec > Montreal (0.04)

Industry: Education > Educational Setting > Online (0.42)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Enterprise Applications > Human Resources > Learning Management (0.42)

Add feedback

873be0705c80679f2c71fbf4d872df59-Paper.pdf

Neural Information Processing SystemsOct-3-2025, 03:53:18 GMT

artificial intelligence, machine learning, reinforcement learning, (18 more...)

Neural Information Processing Systems

Country:

North America > United States > Maryland (0.04)
North America > Canada (0.04)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (0.71)

Add feedback

A regret minimization approach to fixed-point iterations

Kwon, Joon

arXiv.org Artificial IntelligenceSep-29-2025

We propose a conversion scheme that turns regret minimizing algorithms into fixed point iterations, with convergence guarantees following from regret bounds. The resulting iterations can be seen as a grand extension of the classical Krasnoselskii--Mann iterations, as the latter are recovered by converting the Online Gradient Descent algorithm. This approach yields new simple iterations for finding fixed points of non-self operators. We also focus on converting algorithms from the AdaGrad family of regret minimizers, and thus obtain fixed point iterations with adaptive guarantees of a new kind. Numerical experiments on various problems demonstrate faster convergence of AdaGrad-based fixed point iterations over Krasnoselskii--Mann iterations.

artificial intelligence, iteration, machine learning, (17 more...)

arXiv.org Artificial Intelligence

2509.21653

Country: